
    Full Page Handwriting Recognition via Image to Sequence Extraction

    We present a Neural Network based Handwritten Text Recognition (HTR) model architecture that can be trained to recognize full pages of handwritten or printed text without image segmentation. Being based on an Image to Sequence architecture, it can be trained to extract text present in an image and sequence it correctly without imposing any constraints on language, shape of characters, or orientation and layout of text and non-text. The model can also be trained to generate auxiliary markup related to formatting, layout, and content. We use a character-level token vocabulary, thereby supporting proper nouns and terminology of any subject. The model achieves a new state of the art in full-page recognition on the IAM dataset, and when evaluated on scans of real-world handwritten free-form test answers - a dataset beset with curved and slanted lines, drawings, tables, math, chemistry, and other symbols - it performs better than all commercially available HTR APIs. It is deployed in production as part of a commercial web application.
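    The character-level vocabulary mentioned above can be illustrated with a minimal sketch (this is illustrative only, not the paper's implementation; the class and token names are hypothetical). Because every character is its own token, any string - including proper nouns and domain terminology never seen at training time - maps to a sequence of known tokens plus a handful of special markers:

    ```python
    class CharTokenizer:
        """Map text to integer token ids, one id per character.
        Hypothetical sketch of a character-level vocabulary."""

        def __init__(self, corpus):
            # Special tokens for sequence start/end and unknown characters.
            specials = ["<sos>", "<eos>", "<unk>"]
            chars = sorted(set("".join(corpus)))
            self.vocab = specials + chars
            self.id_of = {tok: i for i, tok in enumerate(self.vocab)}

        def encode(self, text):
            unk = self.id_of["<unk>"]
            ids = [self.id_of.get(ch, unk) for ch in text]
            return [self.id_of["<sos>"]] + ids + [self.id_of["<eos>"]]

        def decode(self, ids):
            toks = [self.vocab[i] for i in ids]
            # Drop special markers when reconstructing the string.
            return "".join(t for t in toks if not t.startswith("<"))

    tok = CharTokenizer(["handwritten text recognition"])
    ids = tok.encode("new text")
    print(tok.decode(ids))  # → new text
    ```

    The vocabulary stays tiny (tens to hundreds of entries), which is what makes open-ended proper nouns tractable compared with a word-level vocabulary.
    
    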

    Dynamic Feature Selection for Classification on a Budget

    The test-time efficient classification problem consists of:
    • N instances labeled with one of K labels: D = {x_n ∈ X, y_n ∈ Y = {1, …, K}}_{n=1}^N.
    • F features H = {h_f : X → R^{d_f}}_{f=1}^F, with associated costs c_f.
    • A budget-sensitive loss L_B, composed of a cost budget B and a loss function ℓ(ŷ, y) → R.
    The goal is to find a feature selection policy π(x) : X → 2^H and a feature combination classifier g(H_π) : 2^H → Y such that the total budget-sensitive loss Σ_n L_B(g(π(x_n)), y_n) is minimized. The cost of a selected feature subset H_{π(x)} is C_{H_{π(x)}}. The budget-sensitive loss L_B imposes a hard budget constraint by only accepting answers with C_H ≤ B. Additionally, L_B can be cost-sensitive: answers given with less cost are more valuable than costlier answers. The motivation for the latter property is Anytime performance; we should be able to stop our algorithm's execution at any time and have the best possible answer (cf. Figure 1).
    [Figure 1. Definition of the reward function. We seek to maximize the total area above the entropy vs. cost curve from 0 to B, and so define the reward of an individual action as the area of the slice of the total area that it contributes. From state s, action a = h_f leads to state s′ with cost c_f. The information gain of the action is I_{H_s}(Y; h_f) = H(Y; H_s) − H(Y; H_s ∪ {h_f}), and its reward is I_{H_s}(Y; h_f)(B_s − ½ c_f).]
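    The reward of a single action can be sketched numerically (a minimal illustration with made-up numbers, not the paper's code): given the label distribution before and after observing a feature, the information gain is the drop in Shannon entropy, and the reward scales it by the remaining budget minus half the feature's cost.

    ```python
    import math

    def entropy(p):
        """Shannon entropy (bits) of a discrete distribution over the labels."""
        return -sum(q * math.log2(q) for q in p if q > 0)

    def action_reward(p_before, p_after, budget_left, cost):
        """Reward of selecting feature h_f from state s:
        I_{H_s}(Y; h_f) * (B_s - c_f / 2), i.e. the area of the slice the
        action contributes above the entropy-vs-cost curve."""
        info_gain = entropy(p_before) - entropy(p_after)
        return info_gain * (budget_left - 0.5 * cost)

    # Hypothetical numbers: a cheap feature sharpens a 2-class posterior.
    p_before = [0.5, 0.5]   # H = 1 bit
    p_after = [0.9, 0.1]    # H ≈ 0.469 bits
    print(round(action_reward(p_before, p_after, budget_left=10.0, cost=2.0), 3))  # → 4.779
    ```

    Halving the cost inside the reward credits actions taken earlier in the budget more, which is what produces the Anytime behavior described above.
    
    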